An Efficient Method To Synthesize Chinese Speech With Speaker Style

Authors

  • Zhong-ke Ma
  • Wei Li
  • Deyu Xia
  • Ren-Hua Wang
Abstract

Department of Electronic Engineering & Information Science, University of Science & Technology of China, P.O. Box 4, Hefei 230027. Tel: 0551 3603645

ABSTRACT

This paper introduces a corpus-based Chinese speech synthesis method that can produce Chinese speech in the style of the original speaker who recorded the corpus. Corpus-based speech synthesis poses two major problems. First, what content should be kept in the corpus? Second, given a target sentence, how should the synthesis units be selected from the corpus? We present our solution to these two questions.

1. INTRODUCTION

It is no longer difficult to synthesize Chinese speech with high naturalness using a limited number of instances for each elemental unit, but the synthesized speech usually loses the speaking style of the speaker who recorded these instances. Two reasons for this problem are analyzed below. First, the instances of the elemental units are often taken from specially designed nonsense words, so the prosody of these instances differs from that of actual speech. Second, knowledge about the prosody of actual speech is insufficient, so the target prosody produced by the prosody production module of a speech synthesis system differs from that of actual speech. Because the naturalness and speaking style of Chinese are determined mainly by speech prosody, these two differences cause the synthesized speech to lose the original style.

Similar resources

Integration of Intonation in Trainable Speech Synthesis

Current developments in artificial speech synthesis place more emphasis on spectral continuity and diverse prosodic effects. In recent studies, the trainable HMM-based speech synthesis method has generated a more continuous spectral structure than the unit selection method, but the pitch contour generated by the HMM-based method tends to be over-smoothed and lacks syllable variance in Chinese. In this pa...

Constructing stylistic synthesis databases from audio books

In this paper, we explore how to construct stylistic TTS databases from audio books, in which a storyteller performs multiple roles. The goal is to identify and build a set of speech corpora, each of which not only portrays a representative voice style performed by the speaker, but also has sufficient sentences to synthesize natural speech using a unit selection approach. We solve the problem in ...

HMM based TTS for mixed language text

When synthesizing Chinese text mixed with English text, it is usually preferred to synthesize the mixed-language content with a single voice. However, the synthesized English of HMM-based TTS may sound unnatural if the models are built directly from a Chinese speaker’s unprofessional English data. In this paper, we propose to use the MLLR speaker adaptation method to leverage a native English speak...

Turning a Monolingual Speaker into Multilingual for a Mixed-language TTS

We propose a new approach to rendering speech of different languages with only a speaker’s monolingual recordings for mixed-code TTS applications. A reference speaker in the target language (say Chinese) is used to help build the target language TTS with “tiles” of the original source speaker’s monolingual (say English) data. The difference between the monolingual source speaker and the refe...

A corpus-based Chinese speech synthesis with contextual dependent unit selection

This paper describes the realization of a corpus-based Chinese speech synthesis system, including the corpus design and unit selection procedure. The system selects the synthesis unit according to context similarity between target unit and candidate unit. Neither prosody parameter prediction nor prosody feature modification is needed. The informal test shows that the synthesized speech is quite...

Publication date: 2000